Haplotype analysis in the presence of informatively missing genotype data.
Abstract
It is common to have missing genotypes in practical genetic studies, but the exact underlying missing data mechanism is generally unknown to the investigators. Although some statistical methods can handle missing data, they usually assume that genotypes are missing at random, that is, at a given marker, different genotypes and different alleles are missing with the same probability. These include methods for haplotype frequency estimation and haplotype association analysis. However, this simple assumption likely does not hold in practice, yet few studies to date have examined the magnitude of the effects when it is violated. In this study, we demonstrate that violating this assumption may lead to serious bias in haplotype frequency estimates, and that haplotype association analysis based on it can produce both false-positive and false-negative evidence of association. To address this limitation of current methods, we propose a general missing data model that characterizes missing data patterns across a set of two or more markers simultaneously. We prove that, under our general missing data model, haplotype frequencies and missing data probabilities are identifiable if and only if there is linkage disequilibrium between these markers. Simulation studies on haplotypes consisting of two single nucleotide polymorphisms illustrate that our proposed model can reduce the bias in both haplotype frequency estimates and association analysis caused by incorrect assumptions about the missing data mechanism. Finally, we illustrate the utility of our method through its application to a real data set.
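As a rough illustration of why the missing-at-random assumption matters, the sketch below computes the expected allele frequency among observed genotypes at a single marker. It is not the model proposed in the paper: the single-marker setting, the Hardy-Weinberg proportions, the function name `complete_case_allele_freq`, and all numeric values are assumptions chosen only for illustration.

```python
def complete_case_allele_freq(p, miss):
    """Expected frequency of allele A among observed (non-missing) genotypes.

    p    -- true frequency of allele A (Hardy-Weinberg proportions assumed)
    miss -- hypothetical per-genotype missing probabilities for "AA", "Aa", "aa"
    """
    q = 1.0 - p
    genotype_prob = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}
    # Probability that a genotype occurs and is also observed.
    observed = {g: genotype_prob[g] * (1.0 - miss[g]) for g in genotype_prob}
    total = sum(observed.values())
    # Each AA carries two copies of A, each Aa carries one.
    return (observed["AA"] + 0.5 * observed["Aa"]) / total


true_p = 0.3  # illustrative value, not taken from the paper

# Missing at random: every genotype is missing with the same probability.
at_random = {"AA": 0.1, "Aa": 0.1, "aa": 0.1}
# Informative missingness: genotypes carrying A are missing more often.
informative = {"AA": 0.3, "Aa": 0.1, "aa": 0.0}

freq_at_random = complete_case_allele_freq(true_p, at_random)
freq_informative = complete_case_allele_freq(true_p, informative)
print(freq_at_random, freq_informative)
```

Under equal missing probabilities the complete-case estimate recovers the true frequency, whereas under the genotype-dependent pattern it falls below it. The same kind of distortion, extended to multi-marker haplotypes, is what the abstract describes as bias in haplotype frequency estimates.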
Journal:
- Genetic Epidemiology
Volume 30, Issue 4
Pages: -
Publication date: 2006